80,324 research outputs found

Recommended from our members

Improved Draft Genome Sequence of Microbacterium sp. Strain LKL04, a Bacterial Endophyte Associated with Switchgrass Plants.

Author: Bokros Norbert
DeBolt Seth
Kyrpides Nikos C
Sahib Mohammad Radhi
Shapiro Nicole
Woyke Tanja
Xia Ye
Yang Piao
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

We report here the genome assembly and analysis of Microbacterium sp. strain LKL04, a Gram-positive bacterial endophyte isolated from switchgrass plants (Panicum virgatum) grown on a reclaimed coal-mining site. The 2.9-Mbp genome of this bacterium was assembled into a single contig encoding 2,806 protein-coding genes.

eScholarship - University of California

Data Mining of the Coffee Rust Genome

Author: Alvaro Gaitan
David Octavio Botero-Rozo
Diego M. Riañ
Marco Cristancho
Silvia Restrepo
William Giraldo
Publication date: 27/03/2012

The genomes of nine isolates of _Hemileia vastatrix_, the causal agent of coffee leaf rust, were sequenced by Illumina and 454. Quality control, cleaning, and _de novo_ assembly of the data were performed. Since isolates were obtained from the field and it is not possible to produce axenic cultures of _H. vastatrix_, MEGAN software was used to evaluate contamination levels and to select contigs with fungal similarities. Mitochondrial contigs were identified and annotated by comparing this assembly against the _Puccinia_ genome. Furthermore, two transcriptomes from isolates of _H. vastatrix_ were assembled to complement the genomic data.

Nature Precedings

Reference Based Genome Compression

Author: Chern Bobbie
Manolakos Alexandros
No Albert
Ochoa Idoia
Venkat Kartik
Weissman Tsachy
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it to 834.8 MB.

Comment: 5 pages; Submitted to the IEEE Information Theory Workshop (ITW) 201
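The two-stage scheme this abstract describes (map the target onto the reference, then compress the mapping) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the greedy k-mer matching, the `["C", pos, length]` / `["L", base]` operation format, and every function name are assumptions made for illustration, and `zlib` stands in for the entropy coder.

```python
import json
import zlib


def build_index(reference, k):
    """Map each k-mer of the reference to the first position where it occurs."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)
    return index


def encode(reference, target, k=4):
    """Describe the target as copies from the reference plus literal bases."""
    index = build_index(reference, k)
    ops = []
    i = 0
    while i < len(target):
        pos = index.get(target[i:i + k])
        if pos is None:
            # No k-mer match at this position: emit a single literal base.
            ops.append(["L", target[i]])
            i += 1
            continue
        # Extend the match for as long as reference and target agree.
        length = 0
        while (pos + length < len(reference)
               and i + length < len(target)
               and reference[pos + length] == target[i + length]):
            length += 1
        ops.append(["C", pos, length])
        i += length
    return ops


def decode(reference, ops):
    """Rebuild the target from the reference and the list of operations."""
    parts = []
    for op in ops:
        if op[0] == "C":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


def compress(reference, target, k=4):
    """Serialize the mapping and compress it (zlib is a stand-in coder)."""
    mapping = json.dumps(encode(reference, target, k)).encode("ascii")
    return zlib.compress(mapping, 9)


def decompress(reference, blob):
    """Invert compress(); requires the same reference sequence."""
    return decode(reference, json.loads(zlib.decompress(blob)))


if __name__ == "__main__":
    reference = "ACGTACGGTCAATGCC" * 20
    # Target differs from the reference by one substitution at position 100.
    target = reference[:100] + "T" + reference[101:]
    blob = compress(reference, target)
    assert decompress(reference, blob) == target
    print(len(target), "bases ->", len(blob), "bytes")
```

Because the target is stored only as its differences from the reference, the mapping stays short whenever the two sequences are similar. For scale, the figures quoted in the abstract correspond to a ratio of roughly 428:1 (2991 MB to 6.99 MB) against roughly 3.6:1 for Gzip (2991 MB to 834.8 MB).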

arXiv.org e-Print Archive

CiteSeerX

Crossref

A mass spectrometry-guided genome mining approach for natural product peptidogenomics.

Author: Cimermancic Peter
Dorrestein Pieter C
Fenical William
Fischbach Michael A
Kersten Roland D
Moore Bradley S
Nam Sang-Jip
Xu Yuquan
Yang Yu-Liang
Publication venue: eScholarship, University of California
Publication date: 01/10/2011

Peptide natural products show broad biological properties and are commonly produced by orthogonal ribosomal and nonribosomal pathways in prokaryotes and eukaryotes. To harvest this large and diverse resource of bioactive molecules, we introduce here natural product peptidogenomics (NPP), a new MS-guided genome-mining method that connects the chemotypes of peptide natural products to their biosynthetic gene clusters by iteratively matching de novo tandem MS (MS(n)) structures to genomics-based structures following biosynthetic logic. In this study, we show that NPP enabled the rapid characterization of over ten chemically diverse ribosomal and nonribosomal peptide natural products of previously unidentified composition from Streptomycete bacteria as a proof of concept to begin automating the genome-mining process. We show the identification of lantipeptides, lasso peptides, linaridins, formylated peptides and lipopeptides, many of which are from well-characterized model Streptomycetes, highlighting the power of NPP in the discovery of new peptide natural products from even intensely studied organisms.

eScholarship - University of California

Strain prioritization and genome mining for enediyne natural products

Author: Ben Shen
Christoph Rader
Dong Yang
Filip Van Nieuwerburgh
Hindra
Huiming Ge
Ivana Crnovčić
Jeffrey D. Rudolf
Jeremy R. Lohman
Li-Xing Zhao
Qihui Teng
Tingting Huang
Xiangcheng Zhu
Xiaohui Yan
Xiuling Li
Yannick Gansemans
Yanwen Duan
Yi Jiang
Yong Huang
Publication venue: American Society for Microbiology
Publication date: 01/01/2016

The enediyne family of natural products has had a profound impact on modern chemistry, biology, and medicine, and yet only 11 enediynes have been structurally characterized to date. Here we report a genome survey of 3,400 actinomycetes, identifying 81 strains that harbor genes encoding the enediyne polyketide synthase cassettes that could be grouped into 28 distinct clades based on phylogenetic analysis. Genome sequencing of 31 representative strains confirmed that each clade harbors a distinct enediyne biosynthetic gene cluster. A genome neighborhood network allows prediction of new structural features and biosynthetic insights that could be exploited for enediyne discovery. We confirmed one clade as new C-1027 producers, with a significantly higher C-1027 titer than the original producer, and discovered a new family of enediyne natural products, the tiancimycins (TNMs), that exhibit potent cytotoxicity against a broad spectrum of cancer cell lines. Our results demonstrate the feasibility of rapid discovery of new enediynes from a large strain collection.

IMPORTANCE: Recent advances in microbial genomics clearly revealed that the biosynthetic potential of soil actinomycetes to produce enediynes is underappreciated. A great challenge is to develop innovative methods to discover new enediynes and produce them in sufficient quantities for chemical, biological, and clinical investigations. This work demonstrated the feasibility of rapid discovery of new enediynes from a large strain collection. The new C-1027 producers, with a significantly higher C-1027 titer than the original producer, will impact the practical supply of this important drug lead. The TNMs, with their extremely potent cytotoxicity against various cancer cells and their rapid and complete cancer cell killing characteristics, in comparison with the payloads used in FDA-approved antibody-drug conjugates (ADCs), are poised to be exploited as payload candidates for the next generation of anticancer ADCs. Follow-up studies on the other identified hits promise the discovery of new enediynes, radically expanding the chemical space for the enediyne family.

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

Predicting Combinatorial Binding of Transcription Factors to Regulatory Elements in the Human Genome by Association Rule Mining

Author: Iyer Vishwanath R.
Miranker Daniel P.
Morgan Xochitl C.
Ni Shulin
Publication date: 01/01/2007

Cis-acting transcriptional regulatory elements in mammalian genomes typically contain specific combinations of binding sites for various transcription factors. Although some cis-regulatory elements have been well studied, the combinations of transcription factors that regulate normal expression levels for the vast majority of the 20,000 genes in the human genome are unknown. We hypothesized that it should be possible to discover transcription factor combinations that regulate gene expression in concert by identifying over-represented combinations of sequence motifs that occur together in the genome. To detect combinations of transcription factor binding motifs, we developed a data mining approach based on association rules, which are typically used in market basket analysis. We scored each segment of the genome for the presence or absence of each of 83 transcription factor binding motifs, then used association rule mining algorithms to mine this dataset, thus identifying frequently occurring pairs of distinct motifs within a segment.

Results: Support for most pairs of transcription factor binding motifs was highly correlated across different chromosomes, although pair significance varied. Known true positive motif pairs showed higher association rule support, confidence, and significance than background. Our subsets of high-confidence, high-significance mined pairs of transcription factors showed enrichment for co-citation in PubMed abstracts relative to all pairs, and the predicted associations were often readily verifiable in the literature.

Conclusion: Functional elements in the genome where transcription factors bind to regulate expression in a combinatorial manner are more likely to be predicted by identifying statistically and biologically significant combinations of transcription factor binding motifs than by simply scanning the genome for the occurrence of binding sites for a single transcription factor.

NIAAA Alcohol Training Grant; National Science Foundation; Cellular and Molecular Biolog
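The support and confidence measures named in this abstract follow the standard association rule definitions, which can be sketched for motif pairs in Python. This is a minimal illustration, not the authors' pipeline: the function name `pair_rules`, the placeholder motif labels, and the `min_support` parameter are assumptions, and the significance scoring mentioned in the abstract is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def pair_rules(segments, min_support=0.0):
    """Compute support and confidence for every motif pair.

    `segments` is a list of collections, each holding the motifs scored
    as present in one genome segment (a hypothetical input format).
    """
    n = len(segments)
    if n == 0:
        return []
    single = Counter()  # number of segments containing each motif
    pair = Counter()    # number of segments containing both motifs of a pair
    for motifs in segments:
        present = sorted(set(motifs))
        single.update(present)
        pair.update(combinations(present, 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n  # fraction of segments containing both motifs
        if support < min_support:
            continue
        # Each pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            rules.append({
                "if": x,
                "then": y,
                "support": support,
                # Fraction of segments with x that also contain y.
                "confidence": count / single[x],
            })
    return rules


if __name__ == "__main__":
    # Placeholder motif labels, not real transcription factor names.
    segments = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
    for rule in pair_rules(segments):
        print(rule)
```

Support measures how often a pair co-occurs across all segments, while confidence is directional: a rare motif that almost always appears alongside a common one gives high confidence in one direction only.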

Crossref

PubMed Central

Texas ScholarWorks

PROPHECY—a database for high-resolution phenomics

Author: Blomberg Anders
Ericson Elke
Fernandez-Ricaud Luciano
Kemp Graham J. L.
Nerman Olle
Pylvänäinen Ilona
Warringer Jonas
Publication venue: Oxford University Press
Publication date: 17/12/2004

The rapid recent evolution of the field of phenomics (the genome-wide study of gene dispensability by quantitative analysis of phenotypes) has resulted in an increasing demand for new data analysis and visualization tools. Following the introduction of a novel approach for precise, genome-wide quantification of gene dispensability in Saccharomyces cerevisiae, we here announce a public resource for mining, filtering and visualizing phenotypic data: the PROPHECY database. PROPHECY is designed to allow easy and flexible access to physiologically relevant quantitative data for the growth behaviour of mutant strains in the yeast deletion collection during conditions of environmental challenges. PROPHECY is publicly accessible at http://prophecy.lundberg.gu.se.

Crossref

PubMed Central

Chalmers Research

Chalmers Publication Library